Outlier detection algorithm based on neighborhood value difference metric
YUAN Zhong, FENG Shan
Journal of Computer Applications    2018, 38 (7): 1905-1909.   DOI: 10.11772/j.issn.1001-9081.2017123028
To address the problems that traditional distance metrics cannot effectively handle data sets with symbolic attributes and that the classical rough set model cannot effectively handle data sets with numerical attributes, an improved outlier detection method based on the Neighborhood Value Difference Metric (NVDM) was proposed, exploiting the granulation features of neighborhood rough sets. Firstly, with attribute values normalized, a Neighborhood Information System (NIS) was constructed based on an optimized Heterogeneous Euclidean-Overlap Metric (HEOM) and a neighborhood radius with adaptive characteristics. Secondly, the Neighborhood Outlier Factor (NOF) of each data object was constructed based on the NVDM. Finally, a Neighborhood Value Difference Metric-based Outlier Detection (NVDMOD) algorithm was designed and implemented; when computing the Single Attribute Neighborhood Cover (SANC), it improves on the traditional unordered one-by-one model by making full use of ordered binary search and nearest-neighbor search. The NVDMOD algorithm was compared with existing outlier detection algorithms, including the NEighborhood outlier Detection (NED) algorithm, the DIStance-based outlier detection (DIS) algorithm and the K-Nearest Neighbor (KNN) algorithm, on UCI standard data sets. The experimental results show that the NVDMOD algorithm has much higher adaptability and effectiveness, providing a more effective method for outlier detection on mixed-attribute data sets.
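The abstract gives no formulas, so the following is only a rough Python sketch of the general idea it describes: a HEOM-style heterogeneous distance over mixed numeric/symbolic attributes, a fixed-radius neighborhood built from it, and an outlier score that grows as the neighborhood gets sparser. All names, the toy data and the radius are hypothetical; the paper's optimized HEOM, adaptive radius and SANC-based ordered search are not reproduced here.

```python
def heom(x, y, is_numeric, ranges):
    # HEOM-style distance: range-normalized difference on numeric
    # attributes, 0/1 overlap on symbolic attributes.
    total = 0.0
    for a in range(len(x)):
        if is_numeric[a]:
            total += (abs(x[a] - y[a]) / ranges[a]) ** 2
        else:
            total += 0.0 if x[a] == y[a] else 1.0
    return total ** 0.5

def neighborhood_outlier_factors(data, is_numeric, delta):
    # Score each object by the sparsity of its delta-neighborhood:
    # fewer neighbors -> score closer to 1 -> more outlying.
    n = len(data)
    ranges = []
    for a in range(len(data[0])):
        if is_numeric[a]:
            col = [row[a] for row in data]
            ranges.append(max(col) - min(col) or 1.0)
        else:
            ranges.append(1.0)
    scores = []
    for i in range(n):
        neighbors = sum(
            1 for j in range(n)
            if heom(data[i], data[j], is_numeric, ranges) <= delta
        )
        scores.append(1.0 - neighbors / n)
    return scores

# Toy mixed-attribute data: (numeric, symbolic)
data = [(1.0, 'a'), (1.1, 'a'), (0.9, 'a'), (9.0, 'b')]
scores = neighborhood_outlier_factors(data, [True, False], delta=0.5)
print(scores.index(max(scores)))  # → 3, the isolated object (9.0, 'b')
```

The ordered binary-search and nearest-neighbor-search optimization the abstract credits to NVDMOD would replace the quadratic pairwise loop above when computing single-attribute neighborhood covers.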
Two-level confidence threshold setting method for positive and negative association rules
CHEN Liu, FENG Shan
Journal of Computer Applications    2018, 38 (5): 1315-1319.   DOI: 10.11772/j.issn.1001-9081.2017102469
To address the problem that traditional confidence threshold setting methods for positive and negative association rules struggle to limit the number of low-reliability rules and easily miss interesting association rules, a new two-level confidence threshold setting method incorporating itemset correlation, called PNMC-TWO, was proposed. Firstly, considering the consistency, validity and interestingness of rules, and working within the correlation-support-confidence framework, the way a rule's confidence varies with the support of its itemsets was analyzed systematically, based on the computational relationship between rule confidence and itemset support. Then, combined with users' actual mining requirement for rules of high confidence and interest, a new confidence threshold setting model was proposed to avoid the blindness and randomness of traditional threshold setting. Finally, the proposed method was compared with the original two-threshold method in terms of the quantity and quality of the extracted rules. The experimental results show that the new two-level threshold method not only ensures that the extracted association rules are more effective and interesting, but also significantly reduces the number of low-reliability rules.
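As a rough illustration of the correlation-support-confidence framework the abstract works within (not the paper's actual PNMC-TWO model), the sketch below mines a positive rule X => Y only when the itemsets are positively correlated and a negative rule X => not-Y only when they are negatively correlated, each filtered by its own confidence level. The measures are the standard ones; the threshold values and their assignment are illustrative.

```python
def support(transactions, itemset):
    # Fraction of transactions containing every item in itemset.
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def candidate_rules(transactions, x, y, t_low=0.5, t_high=0.7):
    # Itemset correlation decides whether a positive or a negative
    # rule is meaningful; a confidence level then filters it.
    sx = support(transactions, x)
    sy = support(transactions, y)
    sxy = support(transactions, x | y)
    corr = sxy / (sx * sy)
    rules = []
    if corr > 1:                      # positively correlated itemsets
        conf = sxy / sx               # conf(X => Y)
        if conf >= t_high:
            rules.append(('X => Y', conf))
    elif corr < 1:                    # negatively correlated itemsets
        conf = (sx - sxy) / sx        # conf(X => not Y)
        if conf >= t_low:
            rules.append(('X => not Y', conf))
    return rules

# Toy transactions over items {a, b, c}
ts = [frozenset('ab'), frozenset('ab'), frozenset('ab'), frozenset('c')]
print(candidate_rules(ts, frozenset('a'), frozenset('b')))
# → [('X => Y', 1.0)]
```

The two-level idea is that a single global confidence threshold either admits too many low-reliability rules or discards interesting ones; conditioning the level on correlation is one way to separate the two cases.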
Algorithm for lifting temporal consistency QoS improvement of real-time data objects based on deferrable scheduling
YU Ge, FENG Shan
Journal of Computer Applications    2016, 36 (6): 1645-1649.   DOI: 10.11772/j.issn.1001-9081.2016.06.1645
Concerning the applicability problems of existing scheduling algorithms for guaranteeing the temporal consistency of real-time data objects in soft real-time database systems, a Statistical Deferrable Scheduling-OPTimization (SDS-OPT) algorithm was proposed. Firstly, the characteristics and shortcomings of the existing algorithms were analyzed and compared in terms of scheduling, Quality of Service (QoS) and workload, and the necessity of optimizing them was pointed out. Secondly, to maximize the temporal-consistency QoS of real-time data objects by increasing the number of schedulable jobs of real-time update transactions, the steepest descent method was used to raise the reference value used to screen jobs by execution time. Finally, the proposed algorithm was compared with the existing algorithms in terms of workload and QoS. The experimental results show that, compared with the Deferrable Scheduling algorithm for Fixed Priority transactions (DS-FP) and the Deferring Scheduling-Probability Statistic (DS-PS) algorithm, the proposed algorithm effectively guarantees the temporal consistency of real-time data objects and reduces workload, while significantly improving QoS.
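The workload comparison in this line of work rests on the classical deferrable-scheduling observation: if a sensor value sampled at release time r stays temporally valid until r + V, the next update job may be released as late as r + V - C (C being its execution time), which needs fewer jobs than the traditional half-half baseline that fixes the update period at V/2. A minimal sketch of that job-count comparison, under strong simplifying assumptions (one transaction, no interference, each job runs as soon as it is released; the numbers are hypothetical):

```python
def update_job_count(validity, wcet, horizon, deferrable=True):
    # Number of update jobs needed over `horizon` time units to keep
    # one real-time data object temporally valid, assuming each job
    # executes immediately on release with no interference.
    if deferrable:
        period = validity - wcet   # DS-style: defer as late as validity allows
    else:
        period = validity / 2      # half-half baseline: period = V/2
    jobs, t = 0, 0.0
    while t < horizon:
        jobs += 1
        t += period
    return jobs

# Validity interval 10, execution time 1, over 1000 time units:
print(update_job_count(10, 1, 1000, deferrable=True))   # → 112 jobs
print(update_job_count(10, 1, 1000, deferrable=False))  # → 200 jobs
```

DS-FP, DS-PS and the proposed SDS-OPT differ in how they pick each job's release time and priority under interference from other transactions, which this single-transaction sketch deliberately leaves out.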